Abstract. We investigate the possibility of detecting the 3D cross-correlation power
spectrum of the Ly-α forest and the HI 21-cm signal from the post-reionization epoch. The
cross-correlation signal depends directly on the dark matter power spectrum and is sensitive to
the 21-cm brightness temperature and Ly-α forest biases. These bias parameters dictate the
strength of anisotropy in redshift space. We find that the cross-correlation power spectrum
can be detected using a 400 hr observation with SKA-mid (phase 1) and a futuristic BOSS-like
experiment with a quasar (QSO) density of 30 deg$^{-2}$, at a peak SNR of 15 for a single-field
experiment at redshift $z = 2.5$. We also study the possibility of constraining various
bias parameters using the cross power spectrum. We find that with the same experiment the
$1\sigma$ conditional errors on the 21-cm linear redshift space distortion parameter $\beta_T$
and on $\beta_{\mathcal{F}}$, corresponding to the Ly-α forest, are ~2.7% and ~1.4% respectively
for 10 independent pointings of the SKA-mid (phase 1). This prediction indicates a significant
improvement over existing measurements. We claim that the detection of the 3D cross-correlation
power spectrum will not only ascertain the cosmological origin of the signal in the presence of
astrophysical foregrounds but will also provide stringent constraints on large-scale HI biases.
This provides an independent probe towards understanding cosmological structure formation.
1 Introduction

Intensity mapping of the neutral hydrogen (HI) distribution using observations of redshifted
21-cm radiation is a potentially powerful probe of the large scale structure of the universe and
of the background expansion history in the post-reionization era [1-4] (also see [5] for a review).
The epoch of reionization is believed to have been completed by redshift $z \sim 6$ [6]. Following
this era of phase transition, dense self-shielded Damped Ly-α (DLA) systems contain the bulk of
the HI gas. These DLA systems are believed to be the dominant source of the HI 21-cm signal in
the post-reionization era. Mapping the collective HI 21-cm radiation without resolving the
individual DLAs is expected to yield enormous astrophysical and cosmological information
regarding the large scale matter distribution, galaxy formation, and expansion history of the
Universe in the post-reionization era [4, 7-10].

In the same epoch, HI in the dominantly ionized intergalactic medium (IGM) produces
distinct absorption features in the spectra of background QSOs [11]. The Ly-α forest maps
out the HI density fluctuation field along one-dimensional skewers which correspond to QSO
sight lines. On suitably large cosmological scales, both the Ly-α forest and the redshifted
21-cm signal are believed to be biased tracers of the underlying dark matter (DM)
distribution [12-15]. Hence, the clustering property of these signals is directly related to the
dark matter power spectrum and the cosmological parameters. Like the HI 21-cm signal,
Ly-α forest observations also find a host of cosmological applications, such as measurement of
the matter power spectrum [16], cosmological parameters [17, 18], limits on neutrino mass [19],
constraints on dark energy [20], the reionization history [21], etc. Several radio interferometric
arrays like the Giant Metrewave Radio Telescope (GMRT) $^1$, the Ooty Wide Field
Array (OWFA) [22], the Canadian Hydrogen Intensity Mapping Experiment (CHIME) $^2$, the
Meer-Karoo Array Telescope (MeerKAT) $^3$, and the Square Kilometer Array (SKA) $^4$ are being
designed and are aimed towards observing the background 21-cm radiation for astrophysical
and cosmological investigations. On the other hand, the recent Baryon Oscillation
Spectroscopic Survey (BOSS) $^5$ is aimed towards probing dark energy and cosmic
acceleration through measurements of the large scale structure and the BAO signature in the
Ly-α forest [23]. The availability of high signal-to-noise ratio (SNR) Ly-α forest spectra for
a large number of QSOs from the BOSS survey allows 3D statistics to be done with Ly-α
forest data [24, 25].

1 http://gmrt.ncra.tifr.res.in/

2 http://chime.phas.ubc.ca/

3 http://www.ska.ac.za/meerkat/

4 https://www.skatelescope.org/

Detection of these signals with high statistical significance is confronted by several
observational challenges. For the HI 21-cm observations, the signal is intrinsically extremely
weak compared to the large foregrounds from galactic and extra-galactic sources [26-28].
This inhibits a simple detection. Further, calibration errors and man-made radio frequency
interference plague the signal. A statistical detection of the signal requires careful analysis
of observational errors and precise subtraction of the foregrounds [29, 30]. The various
difficulties faced by Ly-α observations include proper modeling and subtraction of the
continuum flux, incorporating the fluctuations of the ionizing sources, uncertainties in the IGM
temperature-density relation [31], and contamination of the spectra by metal lines [32].

The two signals, being tracers of the underlying large scale structure, are expected to be
correlated on large scales. However, foregrounds and other systematics from two distinct
experiments are believed to be uncorrelated between the two independent observations. Hence,
a cross-correlation signal, if detected, is more likely to be of cosmological origin. The
2D and 3D cross-correlation of the HI 21-cm signal with other tracers of the large scale
structure, such as the Ly-α forest and the Lyman break galaxies, have been proposed as a way
to avoid some of the observational issues [33, 35]. The effectiveness of the cross-correlation
technique has been demonstrated by the successful detection of the HI 21-cm emission at redshift
$z \sim 0.8$ using cross-correlations of HI 21-cm maps and galaxies [36]. It is important to note
that the foregrounds in HI 21-cm observations appear as noise in the cross-correlation and hence
a certain degree of foreground cleaning is still required for a statistically feasible detection.

The study of the large scale correlation of the Ly-α forest [25] has reinforced the belief that
the Ly-α forest traces the dark matter. Further, the cross-correlation of DLAs and the Ly-α
forest has been used to measure the DLA bias [37]. While CMBR observations have
been able to precisely constrain the cosmological parameters, a study of the neutral IGM
requires strong constraints on the bias parameters. These biases are largely investigated
in numerical simulations. A recent measurement of the Ly-α forest parameters using the BOSS
survey [25] is found to be significantly different from those obtained from simulation [12].
Precise constraints on these parameters are extremely important towards understanding the
nature of clustering of the IGM and the physics of structure formation. This motivates us to
investigate the possibility of measuring the large scale HI bias using the cross-correlation of the
Ly-α forest with the 21-cm signal from the post-reionization epoch.

In this paper we first consider the 3D cross power spectrum of the HI 21-cm maps and the
large scale Ly-α forest. We discuss the possibility of detecting the signal using upcoming
SKA-mid phase 1 (SKA1-mid) like telescopes and future Ly-α forest surveys with very high
QSO number densities. Finally, we consider the possibility of estimating cosmological
parameters using such measurements. The fiducial model is chosen to be the spatially flat ΛCDM
cosmology with parameters taken from WMAP 7 [38]. The cosmological parameter $\Omega_\Lambda$ is
the only free parameter for a flat ΛCDM model. We choose this along with the redshift space
parameters for the 21-cm signal and Ly-α observations. Further, the global amplitude of the
cross-power spectrum is sensitive to the HI neutral fraction in the post-reionization epoch and
is chosen as a free parameter. Noting that cosmological parameters are well constrained from
CMBR data, our primary focus in this work is to investigate how the upcoming observations
may put constraints on the bias parameters for the Ly-α forest and 21-cm signal.

5 https://www.sdss3.org/surveys/boss.php


2 Formulation 


2.1 The Ly-α forest and redshifted 21-cm signal from the post-reionization epoch

In the post-reionization epoch, small fluctuations in the HI density field in the largely
ionized IGM show up as distinct absorption features in the spectra of background QSOs,
known as the Ly-α forest. Here, the quantity of observational interest is the transmitted flux
$\mathcal{F}$ through the Ly-α forest. The gas is believed to trace the underlying dark matter
distribution [12, 25] on large scales where pressure plays a minor role. The neutral fraction is also
assumed to be maintained at a constant value in the IGM owing to photo-ionization equilibrium.
This leads to a power law temperature-density relation [39, 40]. These assumptions are
incorporated in the fluctuating Gunn-Peterson approximation [41] relating the transmitted
flux to the dark matter over-density $\delta$ as

$$\exp(-\tau) = \exp\left[-A(1+\delta)^{2-0.7(\gamma-1)}\right] \tag{2.1}$$



where $(\gamma - 1)$ denotes the slope of the power law temperature-density relation [31, 40] and
is sensitive to the reionization history of the Universe. The parameter $A \sim 1$ [32] varies with
redshift and depends on a host of astrophysical and cosmological parameters, like the
photo-ionization rate, the IGM temperature, and the parameters controlling the background
cosmological evolution [16]. It is, however, reasonable to assume that
$\delta_{\mathcal{F}} = (\mathcal{F}/\bar{\mathcal{F}} - 1) \propto \delta$, provided the Ly-α forest
spectrum has been smoothed over some reasonably large length scale [42, 43]. This linearized
relation facilitates analytic computation of the statistical properties of $\delta_{\mathcal{F}}$.
We note that the dominant corrections to this on small scales come from peculiar velocities.
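
As a rough numerical illustration of this linearization, the following Python sketch evaluates the fluctuating Gunn-Peterson expression of eq. (2.1) and compares the fractional change in transmission with its first-order coefficient. The values $A = 1$ and $\gamma = 1.5$, the function names, and the normalization by the transmission at $\delta = 0$ (rather than by the true mean flux) are assumptions made here for illustration only.

```python
import math

def fgpa_transmission(delta, A=1.0, gamma=1.5):
    """exp(-tau) from the fluctuating Gunn-Peterson expression, eq. (2.1).
    A and gamma are illustrative values, not fitted parameters."""
    kappa = 2.0 - 0.7 * (gamma - 1.0)
    return math.exp(-A * (1.0 + delta) ** kappa)

def fractional_change(delta, A=1.0, gamma=1.5):
    """Transmission contrast relative to the transmission at delta = 0
    (an assumed stand-in for the mean-flux normalization)."""
    return fgpa_transmission(delta, A, gamma) / fgpa_transmission(0.0, A, gamma) - 1.0

def linear_coefficient(A=1.0, gamma=1.5):
    """First-order Taylor coefficient: fractional_change ~ coefficient * delta."""
    return -A * (2.0 - 0.7 * (gamma - 1.0))

if __name__ == "__main__":
    for delta in (1e-3, 1e-2, 1e-1):
        exact = fractional_change(delta)
        linear = linear_coefficient() * delta
        print(f"delta={delta:g}  exact={exact:.5f}  linear={linear:.5f}")
```

For small over-densities the two columns agree closely, and the proportionality constant is negative: denser regions transmit less flux.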

The range of redshifts that can be probed using the Ly-α forest can also be probed using
the HI 21-cm emission signal. However, unlike the Ly-α forest, which arises from the low
density HI residing in the IGM, the 21-cm emission from the same epoch will be dominated
by DLAs, which are believed to contain most of the HI during the post-reionization era.
Nevertheless, the HI 21-cm signal is likely to trace the underlying DM distribution on the
large scales of interest. We use $\delta_T$ to denote the redshifted 21-cm brightness temperature
fluctuations.

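We write $\delta_{\mathcal{F}}$ and $\delta_T$ in Fourier space as

$$\delta_a(\mathbf{r}) = \int \frac{d^3\mathbf{k}}{(2\pi)^3}\, e^{i\mathbf{k}\cdot\mathbf{r}}\, \Delta_a(\mathbf{k}) \tag{2.2}$$

where $a = \mathcal{F}$ and $T$ refer to the Ly-α forest transmitted flux and 21-cm brightness
temperature respectively. With all the assumptions discussed above, and incorporating the effect of
peculiar motion through the redshift space distortion, we may write

$$\Delta_a(\mathbf{k}) = C_a\left[1 + \beta_a\mu^2\right]\Delta(\mathbf{k}) \tag{2.3}$$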
where $\Delta(\mathbf{k})$ is the dark matter density contrast in Fourier space and $\mu$ is the cosine
of the angle between the line of sight direction $\hat{\mathbf{n}}$ and the wave vector
($\mu = \hat{\mathbf{k}}\cdot\hat{\mathbf{n}}$). $\beta_a$ is the linear redshift distortion parameter.

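For the 21-cm brightness temperature field we have

$$C_T = 4.0\,\mathrm{mK}\; b_T\,\bar{x}_{\rm HI}\,(1+z)^2\left(\frac{\Omega_{b0}h^2}{0.02}\right)\left(\frac{0.7}{h}\right)\left(\frac{H_0}{H(z)}\right) \tag{2.4}$$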

where $\bar{x}_{\rm HI}$ is the mean neutral fraction. The neutral hydrogen fraction is assumed to be a
constant with a value $\bar{x}_{\rm HI} = 2.45 \times 10^{-2}$, obtained from the measurement of
$\Omega_{\rm gas} \sim 10^{-3}$ [45-48]. In the case of the HI 21-cm signal, the parameter $\beta_T$,
known as the linear redshift distortion parameter, can be written as the ratio between the growth
rate of linear perturbations $f(z)$ and the HI bias $b_T$. The bias function $b_T(k, z)$ has an
intrinsic scale dependence below the Jeans scale and an implicit scale dependence arising from the
fluctuations in the ionizing background [2]. Moreover, it has been shown [44] that this bias is a
monotonically growing function of redshift. The assumption of linear bias is supported by
numerical simulations [13, 14], which indicate that over a wide range of scales a constant bias
model is adequate to describe the distribution of neutral gas for $z < 3$. A recent measurement of
the DLA bias is also consistent with the constant bias model [37]. We find that $f(z) \approx 1$ at
the fiducial redshift of interest $z = 2.5$. We adopt a constant bias $b_T = 2$, which is consistent
with recent results from numerical simulations of the HI 21-cm signal in the post-reionization
epoch [13-15]. This gives the value of the parameter $\beta_T \approx 0.5$.
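
The numbers quoted above can be checked with a short Python sketch. It assumes a flat ΛCDM background with roughly WMAP7-like values ($\Omega_m = 0.27$, $\Omega_{b0}h^2 = 0.0226$, $h = 0.70$), the common approximation $f(z) \approx \Omega_m(z)^{0.55}$ for the growth rate, and the standard form of the amplitude $C_T$ in eq. (2.4). None of these specific inputs or function names are taken from the analysis itself, so the output is indicative only.

```python
import math

def hubble_ratio(z, omega_m=0.27):
    """H(z)/H0 for flat LambdaCDM, radiation neglected (assumed cosmology)."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))

def growth_rate(z, omega_m=0.27, growth_index=0.55):
    """Approximate linear growth rate f(z) = Omega_m(z)**growth_index."""
    omega_m_z = omega_m * (1.0 + z) ** 3 / hubble_ratio(z, omega_m) ** 2
    return omega_m_z ** growth_index

def beta_T(z, b_T=2.0, omega_m=0.27):
    """Linear redshift distortion parameter of the 21-cm signal, f(z)/b_T."""
    return growth_rate(z, omega_m) / b_T

def C_T_mK(z, b_T=2.0, x_HI=2.45e-2, omega_b_h2=0.0226, h=0.70, omega_m=0.27):
    """Amplitude C_T in mK following the form of eq. (2.4)."""
    return (4.0 * b_T * x_HI * (1.0 + z) ** 2
            * (omega_b_h2 / 0.02) * (0.7 / h) / hubble_ratio(z, omega_m))

if __name__ == "__main__":
    z = 2.5
    print(f"f(z)   = {growth_rate(z):.3f}")
    print(f"beta_T = {beta_T(z):.3f}")
    print(f"C_T    = {C_T_mK(z):.3f} mK")
```

Under these assumptions the growth rate is within a few percent of unity at $z = 2.5$, so $\beta_T$ comes out close to 0.5 for $b_T = 2$.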

The linear distortion parameter for the Ly-α forest, denoted by $\beta_{\mathcal{F}}$, cannot be
interpreted in the same way as $\beta_T$. This is because of the non-linear relationship between the
observed Ly-α transmitted flux and the underlying DM density field [25]. The bias factor
for the forest is the bias of the contrast of the fluctuations in the flux and is not the same as
the HI bias. Unlike for the HI 21-cm signal, the parameters $(C_{\mathcal{F}}, \beta_{\mathcal{F}})$ are
independent of each other and depend on the model parameters $A$, $\gamma$ and the flux probability
distribution function (PDF) of the Ly-α forest. In the absence of primordial non-Gaussianity,
fluctuations in the Ly-α flux can be well described by a linear theory with a scale-independent
bias on large scales. This is supported by numerical simulations [12]. The values of
$\beta_{\mathcal{F}}$ obtained in particle-mesh (PM) simulations demonstrate that simulations with
lower resolution (larger smoothing length) yield lower values of $\beta_{\mathcal{F}}$. The smoothing
scale is ideally set by the Jeans scale, which is sensitive to the temperature history of the IGM.

We adopt the approximate values $(C_{\mathcal{F}}, \beta_{\mathcal{F}}) \approx (-0.15, 1.11)$ from
numerical simulations of the Ly-α forest [12]. We note that for cross-correlation studies the
Ly-α forest has to be smoothed to the resolution of the HI 21-cm frequency channels. On these
smoothed scales the linear bias model is well tested in simulations. Results of full hydrodynamical
simulations are expected to yield information at smaller scales, which are not necessary for
the present analysis. We note, however, that these bias values have large uncertainties owing
to the lack of accurate modeling of the IGM. The redshift space distortion parameter
$\beta_{\mathcal{F}}$ is also sensitive to the probing redshift. The large scale correlation of the
Ly-α transmitted flux from the BOSS survey [25] shows that the parameters
$(C_{\mathcal{F}}, \beta_{\mathcal{F}})$ are significantly different from the above values obtained
from simulations. However, it has been suggested that metal line contamination and a correct
treatment of DLAs may explain this discrepancy. In this paper we shall keep to the values
obtained from simulation [12].
2.2 Cross-correlation power spectrum

The possibility that both the Ly-α forest and the HI 21-cm signal from the post-reionization
epoch trace the underlying dark matter density field on large scales motivates us to investigate
their cross-correlation signal [33]. Though the respective auto-correlation power spectra
can independently put constraints on various astrophysical and cosmological parameters, there
are several advantages of cross-correlating the signals.

The main advantage of cross-correlation is that the issue of foregrounds and other
systematics can be coped with more easily than in the auto-correlation. Even the
smallest foreground residual will plague the auto-correlation signal. The cosmological origin
of the signal can only be ascertained if it is detected with statistical significance in
cross-correlation. Further, a joint analysis of two data sets would involve not only the individual
auto-correlations but also the cross-correlation information. Sometimes the two independent
probes focus on specific complementary Fourier modes with high SNR, whereby the cross
signal takes advantage of both probes simultaneously. This has been studied in the
context of BAO where, owing to the difference in the values of the parameters $\beta_T$ and
$\beta_{\mathcal{F}}$, the two probes have different sensitivities to radial and transverse
clustering [34].

It is true that if the observations of the independent probes are perfect measurements,
no new information can be obtained from the cross-correlation. However, the first generation
measurements of the HI 21-cm signal are expected to be noisy and to have systematic
errors. For a detection of the 21-cm signal, these measurements can in principle be
cross-correlated against a high SNR Ly-α forest signal for cosmological investigations which may
not be possible with a low quality auto-correlation analysis.

We consider the power spectrum of the 21-cm signal, that of the Ly-α forest, and the
cross-correlation power spectrum in three dimensions. The general 3D power spectra for the two
fields are defined as

$$\langle\Delta_a(\mathbf{k})\,\Delta_b^*(\mathbf{k}')\rangle = (2\pi)^3\,\delta^3(\mathbf{k}-\mathbf{k}')\,P_{ab}(\mathbf{k}) \tag{2.5}$$

where $a, b$ can be $\mathcal{F}$ and $T$. In general the power spectrum can be written in redshift
space as

$$P_{ab}(\mathbf{k}) = C_a C_b\,(1+\beta_a\mu^2)(1+\beta_b\mu^2)\,P(k) \tag{2.6}$$

where $P(k)$ is the matter power spectrum. The auto-correlation power spectrum corresponds
to $a = b$ and the cross-correlation power spectrum corresponds to $a \neq b$.
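
A minimal Python sketch of eq. (2.6) is given below. The matter power spectrum is passed in as a callable; the `toy_matter_power` function is a hypothetical placeholder with an arbitrary shape, not a ΛCDM spectrum, and the amplitude $C_T = 0.77$ mK used in the example is an assumed round number.

```python
def redshift_space_power(k, mu, C_a, beta_a, C_b, beta_b, matter_power):
    """P_ab(k, mu) = C_a C_b (1 + beta_a mu^2)(1 + beta_b mu^2) P(k), eq. (2.6)."""
    return (C_a * C_b
            * (1.0 + beta_a * mu ** 2)
            * (1.0 + beta_b * mu ** 2)
            * matter_power(k))

def toy_matter_power(k):
    """Hypothetical smooth P(k) in Mpc^3, used only to exercise the formula."""
    x = k / 0.02
    return 2.0e4 * x / (1.0 + x * x) ** 1.5

if __name__ == "__main__":
    C_F, beta_F = -0.15, 1.11   # Ly-alpha forest values adopted in the text
    C_T, beta_T = 0.77, 0.5     # C_T in mK is an assumed illustrative amplitude
    k = 0.2
    for mu in (0.0, 0.5, 1.0):
        P_TT = redshift_space_power(k, mu, C_T, beta_T, C_T, beta_T, toy_matter_power)
        P_FT = redshift_space_power(k, mu, C_F, beta_F, C_T, beta_T, toy_matter_power)
        print(f"mu={mu:.1f}  P_TT={P_TT:10.3f}  P_FT={P_FT:10.3f}")
```

Because $C_{\mathcal{F}}$ is negative while $C_T$ is positive, the cross-power spectrum is negative: regions of higher density emit more 21-cm radiation but transmit less Ly-α flux.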

The cross-correlation can be computed only in the region of overlap between the observed
Ly-α forest and 21-cm fields. However, we note that in real observations the Ly-α
forest surveys are likely to cover a much larger volume than the single-field radio observations
of the HI 21-cm signal. We consider such an overlap volume $\mathcal{V}$ consisting of a patch of
angular extent $\theta_a \times \theta_a$ on the sky plane and of thickness $L$ along the line of
sight direction. We consider the flat sky approximation. This amounts to writing the comoving
separation vector $\mathbf{r}$ as

$$\mathbf{r} = r_\nu\,\boldsymbol{\theta} + \hat{\mathbf{n}}\,\frac{dr_\nu}{d\nu}\,\nu \tag{2.7}$$

where $r_\nu$ is the comoving distance corresponding to the observing frequency $\nu$ and
$\boldsymbol{\theta}$ is a 2D vector on the sky plane. If $B$ denotes the bandwidth of the 21-cm
observation, we have $L = B\,dr_\nu/d\nu$ and $\mathcal{V} = r_\nu^2\theta_a^2 L$. The observed 21-cm
signal (after significant foreground cleaning) in Fourier space is written as

$$\Delta_{To}(\mathbf{k}) = \Delta_T(\mathbf{k}) + \Delta_{NT}(\mathbf{k}) \tag{2.8}$$
where $\Delta_{NT}$ is the corresponding noise. The radio observations measure visibilities as a
function of the 2D baseline vector $\mathbf{U}$ and frequency $\nu$. We have
$\mathbf{k} = (\mathbf{k}_\perp, k_\parallel) = (2\pi\mathbf{U}/r_\nu,\ 2\pi\tau\,d\nu/dr_\nu)$,
where $\tau$ is the Fourier conjugate variable corresponding to $\nu$. The 21-cm brightness
temperature fluctuations in Fourier space are closely related to the measured visibilities.
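
The geometry described above can be sketched numerically as follows. The code assumes a flat ΛCDM background with $\Omega_m = 0.27$ and $h = 0.70$ (roughly WMAP7-like values chosen here, not quoted from the analysis), integrates the comoving distance with Simpson's rule, and maps a baseline $U$ and delay $\tau$ to $(k_\perp, k_\parallel)$. All function names are illustrative.

```python
import math

C_KM_S = 299792.458      # speed of light in km/s
NU_21_MHZ = 1420.40575   # rest frequency of the 21-cm line in MHz

def hubble(z, h=0.70, omega_m=0.27):
    """H(z) in km/s/Mpc for flat LambdaCDM (assumed parameters)."""
    return 100.0 * h * math.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)

def comoving_distance(z, h=0.70, omega_m=0.27, steps=2000):
    """Comoving distance r_nu in Mpc via Simpson's rule (steps must be even)."""
    dz = z / steps
    total = 1.0 / hubble(0.0, h, omega_m) + 1.0 / hubble(z, h, omega_m)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight / hubble(i * dz, h, omega_m)
    return C_KM_S * total * dz / 3.0

def dr_dnu(z, h=0.70, omega_m=0.27):
    """|dr_nu/dnu| in Mpc per MHz at the frequency of redshifted 21-cm emission."""
    return C_KM_S * (1.0 + z) ** 2 / (NU_21_MHZ * hubble(z, h, omega_m))

def wave_vector(U, tau_per_mhz, z):
    """(k_perp, k_par) in 1/Mpc from baseline U (wavelengths) and delay tau (1/MHz)."""
    k_perp = 2.0 * math.pi * U / comoving_distance(z)
    k_par = 2.0 * math.pi * tau_per_mhz / dr_dnu(z)
    return k_perp, k_par

def overlap_volume(theta_a_rad, bandwidth_mhz, z):
    """V = r_nu^2 theta_a^2 L with L = B dr_nu/dnu, in Mpc^3."""
    return comoving_distance(z) ** 2 * theta_a_rad ** 2 * bandwidth_mhz * dr_dnu(z)

if __name__ == "__main__":
    z, B = 2.5, 32.0
    print(f"r_nu = {comoving_distance(z):.0f} Mpc")
    print(f"L    = {B * dr_dnu(z):.0f} Mpc for B = {B:.0f} MHz")
    print("k for U = 100, tau = 1/B:", wave_vector(100.0, 1.0 / B, z))
```

With these assumed parameters the comoving distance to $z = 2.5$ is about 6000 Mpc, and a 32 MHz band corresponds to a line-of-sight depth of a few hundred Mpc, which sets the smallest accessible $k_\parallel \approx 2\pi/L$.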

The Ly-α flux fluctuations are written as a field in 3D space as $\delta_{\mathcal{F}}(\mathbf{r})$.
In reality one has the observed quantity $\delta_{\mathcal{F}o}(\mathbf{r})$, which consists of the
continuous field sampled along skewers corresponding to QSO sight lines. We hence have
$\delta_{\mathcal{F}o}(\mathbf{r}) = \delta_{\mathcal{F}}(\mathbf{r}) \times \rho(\mathbf{r})$, where the
sampling function $\rho(\mathbf{r})$ is defined as

$$\rho(\mathbf{r}) = \frac{\sum_i w_i\,\delta_D^2(\mathbf{r}_\perp - \mathbf{r}_{\perp i})}{L\sum_i w_i} \tag{2.9}$$

and is normalized to unity ($\int dV\,\rho(\mathbf{r}) = 1$). The index $i$ runs up to $N_Q$, the
total number of QSOs considered. The weights $w_i$ introduced in the definition of $\rho$ are chosen
so as to minimize the variance. A suitable choice of the $w_i$ takes care of the fact that the pixel
noise for each of the QSO spectra is in principle different. We have, in Fourier space,

$$\Delta_{\mathcal{F}o}(\mathbf{k}) = \tilde{\rho}(\mathbf{k}) \otimes \Delta_{\mathcal{F}}(\mathbf{k}) + \Delta_{N\mathcal{F}}(\mathbf{k}) \tag{2.10}$$

where $\tilde{\rho}$ is the Fourier transform of $\rho$, $\otimes$ denotes a convolution, and
$\Delta_{N\mathcal{F}}(\mathbf{k})$ denotes a noise term.

We define the cross-correlation estimator $\hat{\mathcal{E}}$ as

$$\hat{\mathcal{E}} = \frac{1}{2}\left[\Delta_{\mathcal{F}o}(\mathbf{k})\,\Delta^*_{To}(\mathbf{k}) + \Delta^*_{\mathcal{F}o}(\mathbf{k})\,\Delta_{To}(\mathbf{k})\right] \tag{2.11}$$

We are interested in the statistical properties of this estimator. Using the definitions of
$\Delta_{\mathcal{F}o}(\mathbf{k})$ and $\Delta_{To}(\mathbf{k})$, we obtain the expectation value of
$\hat{\mathcal{E}}$. Simple algebraic manipulation yields

$$\langle\hat{\mathcal{E}}\rangle = P_{\mathcal{F}T}(\mathbf{k}) \tag{2.12}$$

Thus, the estimator is unbiased and its expectation value faithfully returns the quantity we
are probing, namely the 3D cross-correlation power spectrum $P_{\mathcal{F}T}(\mathbf{k})$. We have
assumed that the different noises are uncorrelated. Further, we note that the QSOs are distributed
at a redshift different from the rest of the quantities and hence $\rho$ shall be uncorrelated with
both $\Delta_T$ and $\Delta_{\mathcal{F}}$.
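
The unbiasedness of the estimator can be illustrated with a small Monte Carlo sketch in Python. It is a deliberately simplified toy: a single Fourier mode is drawn as a complex Gaussian, both observed fields are scaled copies of it with independent complex Gaussian noise, and the sampling function $\rho$ (and hence the convolution in eq. 2.10) is ignored. The numerical values and function names are assumptions chosen only to make the check fast.

```python
import math
import random

def complex_gaussian(rng, power):
    """Draw a complex Gaussian mode z with <|z|^2> = power."""
    s = math.sqrt(power / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

def mean_cross_estimator(n_modes, C_F, C_T, P, N_F, N_T, seed=1):
    """Average of the estimator of eq. (2.11) over n_modes independent realizations."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_modes):
        delta = complex_gaussian(rng, P)                    # common underlying mode
        obs_F = C_F * delta + complex_gaussian(rng, N_F)    # forest mode plus noise
        obs_T = C_T * delta + complex_gaussian(rng, N_T)    # 21-cm mode plus noise
        estimate = 0.5 * (obs_F * obs_T.conjugate() + obs_F.conjugate() * obs_T)
        total += estimate.real                              # imaginary part is zero
    return total / n_modes

if __name__ == "__main__":
    C_F, C_T, P = -0.15, 0.8, 100.0
    mean = mean_cross_estimator(40000, C_F, C_T, P, N_F=1.0, N_T=20.0)
    print("estimated:", round(mean, 3), " expected:", C_F * C_T * P)
```

Because the two noise terms are independent of each other and of the signal, they drop out of the mean and only add scatter to individual realizations.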


2.3 Variance of the estimator and Fisher Matrix analysis 

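The variance of the estimator $\hat{\mathcal{E}}$, defined as
$\sigma^2_{\hat{\mathcal{E}}} = \langle\hat{\mathcal{E}}^2\rangle - \langle\hat{\mathcal{E}}\rangle^2$, gives

$$\sigma^2_{\hat{\mathcal{E}}} = \frac{1}{2}P_{\mathcal{F}T}(\mathbf{k})^2 + \frac{1}{2}\left[P_{\mathcal{FF}}(\mathbf{k}) + P^{1D}_{\mathcal{FF}}(k_\parallel)\,P^{2D}_w + N_{\mathcal{F}}\right]\times\left[P_{TT}(\mathbf{k}) + N_T\right] \tag{2.13}$$

The quantity $P^{1D}_{\mathcal{FF}}(k_\parallel)$ is known as the aliasing term and is the usual 1D
Ly-α flux power spectrum of the individual spectra, given by

$$P^{1D}_{\mathcal{FF}}(k_\parallel) = \frac{1}{(2\pi)^2}\int d^2\mathbf{k}_\perp\,P_{\mathcal{FF}}(\mathbf{k}) \tag{2.14}$$

The quantity $P^{2D}_w$ denotes the power spectrum of the weight function. The quantities $N_T$ and
$N_{\mathcal{F}}$ denote the effective noise power spectra for the 21-cm and Ly-α observations
respectively. Writing $\mathbf{k} = (\mathbf{k}_\perp, k_\parallel)$ with
$|\mathbf{k}_\perp| = k_\perp = k\sin\theta$ and $k_\parallel = k\cos\theta$, we have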


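$$\delta P_{\mathcal{F}T}(\mathbf{k}) = \frac{\sigma_{\hat{\mathcal{E}}}}{\sqrt{N_m(k,\theta)}} \tag{2.15}$$

where $N_m(k,\theta)$ is the number of observable modes between $k$ and $k + dk$ and between
$\theta$ and $\theta + d\theta$, given by

$$N_m(k,\theta) = \frac{2\pi k^2\,\mathcal{V}\,\sin\theta\,dk\,d\theta}{(2\pi)^3} \tag{2.16}$$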
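
Equations (2.13), (2.15) and (2.16) combine into a per-bin error on the cross-power spectrum. The Python sketch below writes them out directly; the function names and the numbers in the example are illustrative assumptions, with all power spectra taken to be in mutually consistent units.

```python
import math

def estimator_variance(P_FT, P_FF, P_TT, P1D_FF, P2D_w, N_F, N_T):
    """Variance of the cross-correlation estimator, following eq. (2.13)."""
    return 0.5 * P_FT ** 2 + 0.5 * (P_FF + P1D_FF * P2D_w + N_F) * (P_TT + N_T)

def number_of_modes(k, theta, dk, dtheta, volume):
    """Number of modes in a (k, theta) bin, following eq. (2.16)."""
    return (2.0 * math.pi * k ** 2 * volume * math.sin(theta) * dk * dtheta
            / (2.0 * math.pi) ** 3)

def cross_power_error(sigma2, n_modes):
    """Error on the binned cross-power spectrum, following eq. (2.15)."""
    return math.sqrt(sigma2 / n_modes)

if __name__ == "__main__":
    # Cosmic-variance-limited toy case: no noise, no aliasing, fully correlated fields.
    sigma2 = estimator_variance(P_FT=-6.0, P_FF=4.0, P_TT=9.0,
                                P1D_FF=0.0, P2D_w=0.0, N_F=0.0, N_T=0.0)
    n_m = number_of_modes(k=0.3, theta=math.pi / 2, dk=0.06,
                          dtheta=math.pi / 10, volume=1.0e8)
    print("sigma^2 =", sigma2, " N_m =", round(n_m), " dP =", cross_power_error(sigma2, n_m))
```

In the noise-free, fully correlated toy case the variance reduces to $P_{\mathcal{F}T}^2$, so the signal-to-noise ratio per bin is simply $\sqrt{N_m}$.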


For the Ly-α forest one may choose the weights $w_i$ of the inverse variance form [49]. However,
a uniform weighting scheme suffices when most of the spectra are measured with a sufficiently
high SNR [50]. This gives $P^{2D}_w = 1/\bar{n}$, where $\bar{n}$ is the 2D density of QSOs
($\bar{n} = N_Q/\mathcal{A}$). We assume that the variance $\sigma^2_{\mathcal{F}N}$ of the pixel noise
contribution to $\delta_{\mathcal{F}}$ is the same across all the QSO spectra, whereby we have
$N_{\mathcal{F}} = \sigma^2_{\mathcal{F}N}/\bar{n}$ for its noise power spectrum. In
arriving at equation (2.13) we have ignored the effect of QSO clustering. In reality, the
distribution of QSOs is expected to exhibit clustering. The clustering would enhance the
term $(P^{1D}_{\mathcal{FF}}(k_\parallel)P^{2D}_w + N_{\mathcal{F}})$ in equation (2.13) by a factor
$(1 + \bar{n}\,C_Q(\mathbf{k}_\perp))$, where $C_Q(\mathbf{k}_\perp)$ is the angular power spectrum
of the QSOs. However, for the QSO surveys under consideration,
the Poisson noise dominates over the clustering term and the latter may be ignored.
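
To give a feel for the size of these terms, the following Python sketch converts an angular QSO density into a comoving surface density and evaluates $P^{2D}_w$, $N_{\mathcal{F}}$ and the clustering enhancement factor. The comoving distance of 6000 Mpc is an assumed round value for $z = 2.5$, the pixel-noise variance is a placeholder, and the function names are illustrative.

```python
import math

def qso_density_per_mpc2(n_per_deg2, r_nu_mpc):
    """Convert a QSO density in deg^-2 to comoving Mpc^-2 at distance r_nu."""
    per_steradian = n_per_deg2 * (180.0 / math.pi) ** 2
    return per_steradian / r_nu_mpc ** 2

def weight_power_2d(n_bar):
    """P_w^2D = 1 / n_bar for uniform weights."""
    return 1.0 / n_bar

def lya_noise_power(sigma2_pixel, n_bar):
    """N_F = sigma_FN^2 / n_bar; units follow the convention chosen for sigma_FN^2."""
    return sigma2_pixel / n_bar

def clustering_enhancement(n_bar, C_Q):
    """Factor (1 + n_bar C_Q) multiplying the aliasing-plus-noise term."""
    return 1.0 + n_bar * C_Q

if __name__ == "__main__":
    n_bar = qso_density_per_mpc2(30.0, r_nu_mpc=6000.0)   # assumed r_nu
    print(f"n_bar  = {n_bar:.3e} Mpc^-2")
    print(f"P_w^2D = {weight_power_2d(n_bar):.1f} Mpc^2")
    print(f"N_F    = {lya_noise_power(0.25, n_bar):.1f} (placeholder sigma_FN^2 = 0.25)")
```

A density of 30 deg$^{-2}$ corresponds to a few times $10^{-3}$ sight lines per comoving Mpc$^2$ under this assumption, which is why the aliasing term remains important unless the QSO density is very high.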

We consider a radio-interferometric measurement of the 21-cm signal, whereby the
instrumental sensitivity per $\mathbf{k}$ mode to the redshifted 21-cm power spectrum at an observed
frequency $\nu = 1420/(1+z)$ MHz can be calculated using the relation [51]

$$N_T(k,\nu) = \frac{T_{\rm sys}^2}{B\,t_0}\left(\frac{\lambda^2}{A_e}\right)^2\frac{r_\nu^2\,L}{n_b(U,\nu)} \tag{2.17}$$


Here, $T_{\rm sys}$ denotes the system temperature and $\lambda$ the observing wavelength. $B$ and
$L$ are, as described before, the observation bandwidth and the comoving length corresponding to
the bandwidth $B$ respectively, $t_0$ is the total observation time, $r_\nu$ is the comoving
distance to the redshift $z$, $n_b(U,\nu)$ is the number density of baselines $U$, where
$U = k_\perp r_\nu/2\pi$, and $A_e$ is the effective collecting area of each
individual antenna. We may write $n_b(U,\nu)$ as


$$n_b(U,\nu) = \frac{N(N-1)}{2}\,f_{2D}(U,\nu) \tag{2.18}$$

where $N$ is the total number of antennae in the radio array and $f_{2D}(U,\nu)$ is the normalized
baseline distribution function, which follows the normalization condition $\int d^2U\,f_{2D}(U,\nu) = 1$.
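
Equations (2.17) and (2.18) can be written as two short Python functions. This is a sketch under stated assumptions: the system temperature is supplied in mK so that the result is in mK$^2$ Mpc$^3$, the value of $f_{2D}$ at the baseline of interest must be supplied by the user (no baseline distribution is modeled here), and the numbers in the example are placeholders rather than the values of the analysis.

```python
def baseline_density(n_antennae, f_2d):
    """n_b(U, nu) = N (N - 1) / 2 * f_2D(U, nu), following eq. (2.18)."""
    return 0.5 * n_antennae * (n_antennae - 1) * f_2d

def noise_power_21cm(T_sys_mK, bandwidth_hz, t_obs_s, wavelength_m,
                     A_eff_m2, r_nu_mpc, L_mpc, n_b):
    """Noise power N_T following eq. (2.17), in mK^2 Mpc^3 for the stated input units."""
    return ((T_sys_mK ** 2 / (bandwidth_hz * t_obs_s))
            * (wavelength_m ** 2 / A_eff_m2) ** 2
            * r_nu_mpc ** 2 * L_mpc / n_b)

if __name__ == "__main__":
    n_b = baseline_density(250, f_2d=1.0e-6)        # f_2D value is a placeholder
    N_T = noise_power_21cm(T_sys_mK=30.0e3, bandwidth_hz=32.0e6,
                           t_obs_s=400 * 3600.0, wavelength_m=0.739,
                           A_eff_m2=123.7, r_nu_mpc=6000.0, L_mpc=337.0, n_b=n_b)
    print(f"n_b = {n_b:.4f},  N_T = {N_T:.3e} mK^2 Mpc^3")
```

The noise power falls inversely with observing time and with the local baseline density, which is why a centrally condensed array is favourable for the large-scale modes.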
The Ly-α forest flux and the 21-cm signal are modeled using the parameters $(C_a, \beta_a)$. The
quantity $A = C_T C_{\mathcal{F}}$ appears as a single overall constant and is largely uncertain. We
note that the parameter $C_a$ also includes the bias parameter $b_a$. Additionally, we consider the
possibility of constraining this amplitude and the two linear redshift space distortion parameters,
i.e. $\beta_T$ and $\beta_{\mathcal{F}}$, along with the cosmological parameter $\Omega_\Lambda$,
assuming a flat ΛCDM model, i.e. $\Omega_\Lambda + \Omega_m = 1$. We label these 4 parameters as
$\lambda_r$. The Fisher matrix used for estimation of parameters is given by the 4×4 matrix


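$$F_{rs} = \int\!\!\int \frac{1}{\sigma^2_{\hat{\mathcal{E}}}}\left(\frac{\partial P_{\mathcal{F}T}}{\partial\lambda_r}\right)\left(\frac{\partial P_{\mathcal{F}T}}{\partial\lambda_s}\right)\frac{2\pi\,\mathcal{V}\,k^2\,dk\,d\mu}{(2\pi)^3} \tag{2.19}$$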
[Figure 1: left panel, 21-cm auto-correlation power spectrum; right panel, cross-correlation power spectrum]

Figure 1. The 3D power spectrum in redshift space at a fiducial redshift $z = 2.5$.
The left panel shows the 21-cm power spectrum $\Delta^2_T = k^3 P_{TT}(\mathbf{k})/2\pi^2$ and the
right panel shows the cross-correlation power spectrum
$\Delta^2_{\mathcal{F}T} = k^3 P_{T\mathcal{F}}(\mathbf{k})/2\pi^2$. The redshift space distortion
appears as deviations from spherical symmetry of the power spectrum.


where $\sigma_{\hat{\mathcal{E}}}$ is obtained from equation (2.13). The marginalized error on the
$i$-th parameter, $\Delta\lambda_i$, is given by the Cramer-Rao bound,
$\Delta\lambda_i = \sqrt{F^{-1}_{ii}}$.
This gives the theoretical bound for the error in a given parameter. If the errors are
correlated, then in the space of the parameters we shall have error contours corresponding to
the significance at which statistical detection is sought. Assuming the Cramer-Rao bound,
the error contours are expected to be ellipses whose areas measure the figure of merit and
whose principal-axis orientation measures the strength of correlation between the parameters.
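
The difference between conditional and marginalized errors can be made concrete with a reduced two-parameter Python sketch of eq. (2.19), restricted to $(\beta_T, \beta_{\mathcal{F}})$. Everything numerical here is an assumption made for illustration: the matter power spectrum is a hypothetical smooth function, the noise terms are constants with the aliasing term absorbed into $N_{\mathcal{F}}$, $C_T = 0.77$ mK is a round value, and $\mu$ is integrated over $[-1, 1]$ with a midpoint rule. The output is therefore not a forecast.

```python
import math

def toy_matter_power(k):
    """Hypothetical smooth P(k) in Mpc^3; a stand-in, not a LambdaCDM spectrum."""
    x = k / 0.02
    return 2.0e4 * x / (1.0 + x * x) ** 1.5

def fisher_matrix(volume, C_F=-0.15, C_T=0.77, beta_T=0.5, beta_F=1.11,
                  N_F=50.0, N_T=500.0, k_min=0.05, k_max=1.0, n_k=60, n_mu=40):
    """2x2 Fisher matrix for (beta_T, beta_F): midpoint-rule version of eq. (2.19)."""
    F = [[0.0, 0.0], [0.0, 0.0]]
    dk = (k_max - k_min) / n_k
    dmu = 2.0 / n_mu                       # assumption: mu runs over [-1, 1]
    for i in range(n_k):
        k = k_min + (i + 0.5) * dk
        P = toy_matter_power(k)
        for j in range(n_mu):
            mu2 = (-1.0 + (j + 0.5) * dmu) ** 2
            rsd_T = 1.0 + beta_T * mu2
            rsd_F = 1.0 + beta_F * mu2
            P_FT = C_F * C_T * rsd_F * rsd_T * P
            P_FF = C_F ** 2 * rsd_F ** 2 * P
            P_TT = C_T ** 2 * rsd_T ** 2 * P
            # Eq. (2.13), with the aliasing term absorbed into the toy N_F.
            sigma2 = 0.5 * P_FT ** 2 + 0.5 * (P_FF + N_F) * (P_TT + N_T)
            grad = (C_F * C_T * mu2 * rsd_F * P,    # dP_FT / d beta_T
                    C_F * C_T * mu2 * rsd_T * P)    # dP_FT / d beta_F
            weight = (2.0 * math.pi * volume * k * k * dk * dmu
                      / ((2.0 * math.pi) ** 3 * sigma2))
            for r in range(2):
                for s in range(2):
                    F[r][s] += weight * grad[r] * grad[s]
    return F

def conditional_and_marginalized(F):
    """Conditional errors 1/sqrt(F_ii) and marginalized errors sqrt((F^-1)_ii)."""
    det = F[0][0] * F[1][1] - F[0][1] * F[1][0]
    conditional = (1.0 / math.sqrt(F[0][0]), 1.0 / math.sqrt(F[1][1]))
    marginalized = (math.sqrt(F[1][1] / det), math.sqrt(F[0][0] / det))
    return conditional, marginalized

if __name__ == "__main__":
    cond, marg = conditional_and_marginalized(fisher_matrix(volume=1.0e9))
    print("conditional  errors (beta_T, beta_F):", cond)
    print("marginalized errors (beta_T, beta_F):", marg)
```

Since both derivatives scale as $\mu^2$ times a slowly varying factor, the two distortion parameters are strongly degenerate in this toy setup, and the marginalized errors come out much larger than the conditional ones.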


3 Results 

3.1 Power spectrum and its detection 

The left panel of figure (1) shows the dimensionless 21-cm power spectrum
($\Delta^2_T(k_\perp, k_\parallel) = k^3 P_{TT}(k_\perp, k_\parallel)/2\pi^2$) in redshift space at the
fiducial redshift $z = 2.5$. In the range of modes of interest,
$10^{-2} < k < 1\ {\rm Mpc}^{-1}$, the signal varies in the range
$10^{-3} < \Delta^2_T < 1\ {\rm mK}^2$. We find that in the plane of $k_\parallel$ and $k_\perp$ the
power spectrum is not circularly symmetric. The degree of asymmetry is sensitive to the redshift
space distortion parameter. Thus, for a given $k$-mode the power spectrum differs for different
sets of $(k_\perp, k_\parallel)$ values. The right panel of figure (1) shows the 21-cm and Ly-α
cross-power spectrum. The departure from spherical symmetry is noted here as well. However, the
magnitude and scale dependence of the distortion differ from those of the 21-cm auto power spectrum
owing to the difference between the linear distortion parameters $\beta$ for the Ly-α forest and the
21-cm signal. We note that anisotropies in the power spectrum are determined by the linear
distortion parameters $\beta$. With the adopted values, $\beta_{\mathcal{F}}$ (1.11) is roughly twice
as large as $\beta_T$ ($\approx 0.5$). Hence, the cross-correlation power spectrum is more anisotropic
than the HI 21-cm auto-correlation power spectrum. The cross-correlation signal varies in
the range $10^{-4} < \Delta^2_{\mathcal{F}T} < 10^{-1}$ mK in the $k$-range
$10^{-2} < k < 1\ {\rm Mpc}^{-1}$.
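
The relative anisotropy of the two spectra follows directly from the angular factors in eq. (2.6). The short Python sketch below compares the line-of-sight ($\mu = 1$) to transverse ($\mu = 0$) power ratio for the auto and cross spectra using the adopted values $\beta_T = 0.5$ and $\beta_{\mathcal{F}} = 1.11$; the function names are illustrative.

```python
import math

def rsd_factor_auto(beta, mu):
    """Angular factor (1 + beta mu^2)^2 of an auto power spectrum."""
    return (1.0 + beta * mu ** 2) ** 2

def rsd_factor_cross(beta_a, beta_b, mu):
    """Angular factor (1 + beta_a mu^2)(1 + beta_b mu^2) of a cross power spectrum."""
    return (1.0 + beta_a * mu ** 2) * (1.0 + beta_b * mu ** 2)

def dimensionless_power(k, P):
    """Delta^2 = k^3 P / (2 pi^2)."""
    return k ** 3 * P / (2.0 * math.pi ** 2)

if __name__ == "__main__":
    beta_T, beta_F = 0.5, 1.11
    print("auto  (mu=1)/(mu=0):", rsd_factor_auto(beta_T, 1.0))
    print("cross (mu=1)/(mu=0):", rsd_factor_cross(beta_T, beta_F, 1.0))
```

The radial-to-transverse contrast is about 2.25 for the 21-cm auto spectrum and about 3.2 for the cross spectrum, which is the sense in which the cross-correlation is the more anisotropic of the two.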
We have chosen a fiducial redshift of $z = 2.5$ for our analysis. This is justified since
the QSO distribution is known to peak in the redshift range $2 < z < 3$. Further, to avoid
metal line contamination and the effect of the QSO Strömgren sphere, only a part of the QSO
spectra is to be considered. At the fiducial redshift this corresponds approximately to a
redshift band $\Delta z \sim 0.4$. The cross-correlation can, however, only be computed in the
region of overlap between the 21-cm signal and the Ly-α forest field. This is dictated by
whichever is smaller: the bandwidth of the 21-cm signal or the redshift range over which one has
the Ly-α spectra.

The details of the radio interferometer specifications used here, other than the baseline
distribution function, can be found in a recent paper [15]. In brief, we consider a radio
interferometric array for the 21-cm observations mimicking the SKA1-mid. The SKA1-mid
is one of the three different instruments that will be built as a part of the SKA telescope.
Although these specifications may undergo changes, we use the specifications provided in the
'Baseline Design Document' $^6$.

We now describe the telescope specifications used for our analysis. We consider an
operational frequency range of 350 MHz to 14 GHz. We assume a total of 250 dish-like
antennae, each of ~15 m diameter $^7$. To calculate the normalized baseline distribution function
$f_{2D}(U,\nu)$ we use the baseline density provided in [35] (blue line in their Fig. 6). The blue
line is essentially proportional to $f_{2D}(U)$ for a given frequency $\nu$. We use the normalization
condition $\int d^2U\,f_{2D}(U,\nu) = 1$ to calculate the normalization factor. We note that the
baseline distribution is centrally condensed, with 40%, 55%, 70%, and 100% of the total antennae
lying within 0.35 km, 1 km, 2.5 km, and 100 km radius respectively. We also assume that there
is no baseline coverage below 30 m. Centrally condensed baseline coverage helps to achieve
sufficient power spectrum sensitivity at large scales. The coverage is poor at very small
scales (large $U$). However, owing to non-linear effects, the modeling of the power spectrum
at these small scales is anyway quite incomplete. We plug everything into Eq. 2.17 to obtain
the required noise error in the 21-cm power spectrum for a given $(k_\perp, k_\parallel)$. Then, for a
given bin $(k_\perp + dk_\perp, k_\parallel + dk_\parallel)$ we calculate the total number of
independent modes $N_c$ and reduce the noise rms by a factor of $\sqrt{N_c}$. We assume
$T_{\rm sys}$ to be 30 K for the redshift $z = 2.5$, corresponding to an observational frequency of
405.7 MHz. We also consider observations over a 32 MHz bandwidth and a typical antenna
efficiency equal to 0.7.
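
A few derived quantities follow directly from these specifications. The Python sketch below computes the observing frequency and wavelength at $z = 2.5$, the effective area of a 15 m dish, and the smallest baseline in units of wavelength. It assumes that the effective area is the antenna efficiency times the geometric dish area, which is a simplification introduced here, and the function names are illustrative.

```python
import math

C_M_S = 299792458.0   # speed of light in m/s

def observed_frequency_mhz(z, rest_mhz=1420.0):
    """Observed frequency of the redshifted 21-cm line in MHz."""
    return rest_mhz / (1.0 + z)

def wavelength_m(freq_mhz):
    """Observing wavelength in metres."""
    return C_M_S / (freq_mhz * 1.0e6)

def effective_area_m2(diameter_m, efficiency):
    """Assumed effective area: efficiency times the geometric area of the dish."""
    return efficiency * math.pi * (diameter_m / 2.0) ** 2

def baseline_in_wavelengths(length_m, freq_mhz):
    """Baseline length expressed as U, in units of the observing wavelength."""
    return length_m / wavelength_m(freq_mhz)

if __name__ == "__main__":
    nu = observed_frequency_mhz(2.5)
    print(f"nu    = {nu:.1f} MHz, lambda = {wavelength_m(nu):.3f} m")
    print(f"A_e   = {effective_area_m2(15.0, 0.7):.1f} m^2")
    print(f"U_min = {baseline_in_wavelengths(30.0, nu):.1f} for a 30 m baseline")
    print(f"t_0   = {400 * 3600:.0f} s for 400 hr")
```

The 30 m lower limit on the baselines corresponds to $U \approx 40$, which through $k_\perp = 2\pi U/r_\nu$ bounds the largest transverse scale that the array can sample.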

6 Available here: https://www.skatelescope.org/key-documents/

7 A recent document "SKA Level 1 Requirements (revision 6)"
(https://www.skatelescope.org/key-documents/) has indicated that only ~50% of these antennae may
be deployed. To achieve the same level of accuracy presented in this work with this degraded
design, one has to consider approximately 4 times the total observation time projected in this
paper. However, this is a naive scaling and a redesigned baseline distribution may change the
entire analysis. For example, if the reduction is only of the large baseline antennae, then the
sensitivities presented here may not be that severely affected.

We note that for a QSO at a given redshift, the region 10,000 km s$^{-1}$ blue-wards of the
QSO's Ly-α emission has to be excluded from the Ly-α forest to avoid the QSO's proximity
effect. Further, at least 1,000 km s$^{-1}$ red-ward of the QSO's Ly-β and O-VI lines may be
discarded to avoid any confusion with the Ly-β forest or the intrinsic O-VI absorption. For
example, for a QSO at the fiducial redshift 2.5, this would allow the Ly-α forest to be measured
in the redshift range $1.96 < z < 2.39$, spanning an interval $\Delta z = 0.43$. It is necessary to
consider a survey with a higher QSO density for the cross-correlation SNR to be competitive
with that of the 21-cm auto-correlation. We consider a BOSS-like QSO survey with a QSO
density of 30 deg$^{-2}$, with spectra measured at an average $2\sigma$ sensitivity. Though the QSO
surveys cover a large portion of the sky, ~10,000 deg$^2$, the cross-correlation can only be
computed in the region of overlap of the 21-cm and Ly-α forest surveys.
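
The redshift window quoted above for a QSO at $z = 2.5$ can be reproduced approximately with the following Python sketch. It uses the rest wavelengths of Ly-α (1215.67 Å) and Ly-β (1025.72 Å), a non-relativistic velocity offset, and only the Ly-β cut for the lower limit (the O-VI cut is not modeled); these simplifications and the function name are assumptions made here.

```python
C_KM_S = 299792.458      # speed of light in km/s
LYA_ANGSTROM = 1215.67   # Ly-alpha rest wavelength
LYB_ANGSTROM = 1025.72   # Ly-beta rest wavelength

def forest_redshift_range(z_qso, v_blue_km_s=10000.0, v_red_km_s=1000.0):
    """Usable Ly-alpha forest absorber redshift range for a QSO at z_qso.

    Upper limit: v_blue blue-ward of the QSO Ly-alpha emission line.
    Lower limit: v_red red-ward of the QSO Ly-beta emission line.
    """
    lam_max = LYA_ANGSTROM * (1.0 + z_qso) * (1.0 - v_blue_km_s / C_KM_S)
    lam_min = LYB_ANGSTROM * (1.0 + z_qso) * (1.0 + v_red_km_s / C_KM_S)
    return lam_min / LYA_ANGSTROM - 1.0, lam_max / LYA_ANGSTROM - 1.0

if __name__ == "__main__":
    z_min, z_max = forest_redshift_range(2.5)
    print(f"{z_min:.2f} < z < {z_max:.2f},  delta z = {z_max - z_min:.2f}")
```

With these simplifications the window comes out within about 0.01 of the quoted limits, confirming that the usable forest spans a redshift interval of roughly 0.4 per QSO.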


[Figure 2: left panel, 21-cm auto-correlation; right panel, cross-correlation]

Figure 2. SNR contours for the 21-cm auto-correlation (left panel) and the cross-correlation (right
panel) power spectrum in redshift space at the fiducial redshift $z = 2.5$ for a 400 hr observation
at 405 MHz, assuming that complete foreground removal is done.

We first consider the idealized situation where the foregrounds in the 21-cm observations
are completely absent. This means that a perfect foreground subtraction has been
achieved. The left panel of Figure 2 shows the SNR contours for the 21-cm auto-correlation
power spectrum for a 400 hrs observation and a total bandwidth of 32 MHz at a frequency of
405.7 MHz, corresponding to z = 2.5, in this idealized situation. We have taken the bin size
to be $(\Delta k, \Delta\theta) = (k/5, \pi/10)$. The SNR peaks (> 20) at the intermediate value
$(k_\perp, k_\parallel) = (0.4, 0.4)$ Mpc$^{-1}$ and falls off at both lower and higher k-values. We find that
a $5\sigma$ detection is possible in the range $0.08 < k_\perp < 0.6$ Mpc$^{-1}$ and $0.1 < k_\parallel < 1.5$ Mpc$^{-1}$.
The corresponding range for a $10\sigma$ detection is $0.12 < k_\perp < 0.5$ Mpc$^{-1}$ and
$0.2 < k_\parallel < 1.2$ Mpc$^{-1}$. At lower
values of k the noise is dominated by cosmic variance, whereas the noise is predominantly
instrumental at large k. However, the lower cut-off in $k_\parallel$ arises from the limited bandwidth,
which we assume to be 32 MHz in this work. The presence of redshift space distortion shows
up as an asymmetry in the sensitivity contours in the $(k_\parallel, k_\perp)$ plane. The enhanced radial
clustering due to the redshift space distortion effect manifests as a higher SNR (say > 10) over a
larger range of $k_\parallel$ than of $k_\perp$.
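
As a rough numerical check of the observing set-up quoted above, the following Python sketch converts the observing frequency to a redshift and estimates the smallest line-of-sight wavenumber that a 32 MHz band can sample. The flat ΛCDM values (h = 0.7, Ω_m = 0.3) and the function names are assumptions made for illustration; they are not taken from this work.

```python
import math

NU_21_MHZ = 1420.40575   # rest-frame frequency of the HI 21-cm line
C_KM_S = 299792.458      # speed of light in km/s


def redshift_of(nu_obs_mhz):
    """Redshift at which the 21-cm line is observed at nu_obs_mhz."""
    return NU_21_MHZ / nu_obs_mhz - 1.0


def hubble_km_s_mpc(z, h=0.7, omega_m=0.3):
    """H(z) for a flat LCDM model; h and omega_m are assumed values."""
    return 100.0 * h * math.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))


def k_par_fundamental(nu_obs_mhz, bandwidth_mhz, h=0.7, omega_m=0.3):
    """Smallest line-of-sight wavenumber (Mpc^-1) sampled by the band."""
    z = redshift_of(nu_obs_mhz)
    # comoving distance spanned per unit frequency, in Mpc per MHz
    dr_dnu = C_KM_S * (1.0 + z) ** 2 / (NU_21_MHZ * hubble_km_s_mpc(z, h, omega_m))
    return 2.0 * math.pi / (bandwidth_mhz * dr_dnu)


z_obs = redshift_of(405.7)
k_min = k_par_fundamental(405.7, 32.0)
print(f"z = {z_obs:.3f}, k_par,min = {k_min:.3f} Mpc^-1")
```

For these assumed parameters the band centre lies at z ≈ 2.5 and the fundamental line-of-sight mode is of order 0.02 Mpc$^{-1}$. This is only the largest scale the band can hold; the detection contours in Figure 2 need not start exactly there.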

The right panel of Figure 2 shows the SNR contours in the $(k_\parallel, k_\perp)$ plane for
the Ly-α forest 21-cm cross-correlation power spectrum. For the 21-cm signal, a 400 hrs
observation is considered. We have considered a QSO number density of $\bar{n} = 30$ deg$^{-2}$, and
the Ly-α spectra are assumed to be measured at a $2\sigma$ sensitivity level. We take $\beta_{\mathcal{F}}$ to be
1.11 and the overall normalization factor $C_{\mathcal{F}} = -0.15$, consistent with a recent measurement [25].
The spectra are assumed to be smoothed to the same level as the frequency channel width
of the 21-cm observations, and the cross correlation is computed in the region of overlap
between the two fields. Although the overall SNR for the cross-correlation power spectrum is
lower than that for the 21-cm auto-correlation power spectrum, a $5\sigma$ detection is possible
in the range $0.1 < k_\perp < 0.4$ Mpc$^{-1}$ and $0.1 < k_\parallel < 1$ Mpc$^{-1}$. The SNR peaks (> 10) at
$(k_\perp, k_\parallel) \sim (0.2, 0.3)$ Mpc$^{-1}$.

Figure 3. SNR contours for the cross power spectrum in redshift space at the fiducial redshift z = 2.5
for a 400 hrs observation at 405 MHz with 10% (left) and 100% (right) foreground residuals remaining
in the 21-cm signal.

We find that in the variance budget $\sigma^2$ (eq. 2.13) the second
term always dominates over the first term, and therefore the variance is essentially determined
by the second term. In an ideal situation, when the system noise for HI 21-cm ($N_T$) and
Ly-α forest ($N_{\mathcal{F}}$) is zero or negligible, and $\bar{n}$ is large enough that the term
$P^{1D}_{\mathcal{FF}}(k_\parallel) P^{2D}_{w}$ becomes very small, the two terms on the rhs of eq. 2.13 become
comparable. This is the cosmic variance limit. However, in practice the second term is always significantly
larger than the first term except at very large scales. We further note that the second
term, which dominates the variance budget on almost all scales, is actually determined by
the term $P^{1D}_{\mathcal{FF}}(k_\parallel) P^{2D}_{w}$ (which arises due to the discreteness of QSO sight-lines) and the system
noise in the HI 21-cm survey, i.e. $N_T$. To be precise, we find that $P^{1D}_{\mathcal{FF}}(k_\parallel) P^{2D}_{w}$ dominates over
$P_{\mathcal{FF}}(\mathbf{k})$ for the scales $(k_\perp, k_\parallel) > (0.05, 0.1)$ Mpc$^{-1}$. Similarly, $N_T$ is higher than $P_{TT}$ for the
scales $(k_\perp, k_\parallel) > (0.2, 0.3)$ Mpc$^{-1}$. The variance can be reduced either by increasing the QSO
number density or by increasing the observing time for the HI 21-cm survey. The QSO number
density considered here is already on the higher side for a BOSS-like survey. Therefore,
the only viable way to reduce the variance is to consider more observation time for the HI 21-cm
survey.
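
The comparison of the two variance terms can be illustrated numerically. The sketch below assumes that eq. 2.13 has the usual Gaussian form for the variance of a cross-power estimator, $\sigma^2 = [P_{\mathcal{F}T}^2 + (P_{\mathcal{FF}} + P^{1D}_{\mathcal{FF}} P^{2D}_{w} + N_{\mathcal{F}})(P_{TT} + N_T)]/N_c$, and that all spectra follow a linear redshift-space model. The values of `c_t`, `beta_t`, the matter power and the noise levels are placeholders chosen for illustration, not the numbers used in this work.

```python
def linear_spectra(p_matter, mu, c_f=-0.15, beta_f=1.11, c_t=1.0, beta_t=0.5):
    """Linear redshift-space auto and cross spectra at one (k, mu).

    c_f and beta_f follow the values quoted in the text; c_t and beta_t
    are placeholder values (assumptions).
    """
    amp_f = c_f * (1.0 + beta_f * mu ** 2)
    amp_t = c_t * (1.0 + beta_t * mu ** 2)
    return amp_f ** 2 * p_matter, amp_t ** 2 * p_matter, amp_f * amp_t * p_matter


def variance_terms(p_ff, p_tt, p_ft, aliasing, n_f, n_t):
    """Return the two terms of the assumed variance, before division by N_c.

    aliasing stands for P^1D_FF(k_par) * P^2D_w.
    """
    first = p_ft ** 2
    second = (p_ff + aliasing + n_f) * (p_tt + n_t)
    return first, second


p_ff, p_tt, p_ft = linear_spectra(p_matter=100.0, mu=0.5)

# Ideal limit: no system noise and a negligible aliasing term.
ideal = variance_terms(p_ff, p_tt, p_ft, aliasing=0.0, n_f=0.0, n_t=0.0)

# Noisy case with placeholder aliasing and noise levels.
noisy = variance_terms(p_ff, p_tt, p_ft, aliasing=5.0, n_f=1.0, n_t=200.0)

print("ideal (first, second):", ideal)
print("noisy (first, second):", noisy)
```

In the ideal limit the two terms coincide for this linear model, since $P_{\mathcal{F}T}^2 = P_{\mathcal{FF}} P_{TT}$; adding aliasing and system noise increases only the second term.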

It is important to note the role of foreground residuals in the auto- and cross-correlation.
Whereas foregrounds appear as an inseparable contaminant in the auto-correlation signal,
they appear only as a contribution to the noise in the cross-correlation. In the computation of
the SNR for the auto-correlation we have tacitly assumed that the foregrounds can be distinguished
from the true cosmological signal. This is practically not possible, and any foreground residual
will plague the signal with additional power which has no cosmological significance.
Figure 2 is in fact hypothetical, since it assumes that foregrounds can be distinguished and
completely separated from the signal. This issue is, however, not present in the cross signal.
Any foreground residual in the 21-cm observation shall only increase the noise and not affect
the cross-correlated signal. This is a key advantage of using cross-correlation as a cosmological
probe, as any detection here will ascertain the cosmological origin of the signal.

Figure 3 shows the cross-correlation in a more realistic scenario wherein we consider
the inclusion of foreground residuals. Though the scale dependence of the foreground residuals
may differ significantly from that of the signal, we have modeled the foreground residuals as
merely 10% and 100% of the signal to get a rough order-of-magnitude estimate of the
degradation of the SNR in the presence of foreground residuals. We note that the shape
of the contours here depends significantly on the nature and clustering properties of the
foregrounds and also on the foreground subtraction technique. We find that
peak SNRs of 16 and 12 can be achieved for these two cases, respectively. The
degradation of the peak SNR is not significant when the foreground residual is 10% of the
signal. However, when the foreground residuals are as large as 100% of the signal (see the right
panel of Figure 3), there is an approximately 35% reduction of the SNR compared
to the no-foreground case. The presence of 100% foreground residuals will clearly inhibit
detection of the signal using the 21-cm auto-correlation power spectrum. However, we can see that a
statistically significant detection with high SNR is possible using the cross-correlation even
if the foreground residual is ~ 100% of the HI 21-cm signal.
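
To see how a foreground residual enters only through the noise, consider the following sketch. It assumes a generic Gaussian variance for the cross-power estimator as a stand-in for eq. 2.13, and treats the residual as extra power, a fraction `f_res` of the 21-cm signal power, added to the 21-cm term of the variance. This is one possible reading of the prescription above; the function name and all numerical inputs are placeholders.

```python
import math


def cross_snr(p_ft, p_ff_total, p_tt, n_t, n_modes, f_res=0.0):
    """SNR of the cross power in one bin.

    p_ff_total : Ly-alpha auto power plus its aliasing and noise terms
    f_res      : foreground residual as a fraction of p_tt (assumption:
                 it adds to the 21-cm noise and leaves p_ft unchanged)
    """
    tt_total = p_tt * (1.0 + f_res) + n_t
    variance = (p_ft ** 2 + p_ff_total * tt_total) / n_modes
    return abs(p_ft) / math.sqrt(variance)


# Placeholder inputs for a single (k_perp, k_par) bin.
inputs = dict(p_ft=-20.0, p_ff_total=10.0, p_tt=120.0, n_t=60.0, n_modes=400)

snr_clean = cross_snr(**inputs)
for f_res in (0.1, 1.0):
    snr = cross_snr(f_res=f_res, **inputs)
    print(f"f_res = {f_res:4.1f}: SNR falls by {100 * (1 - snr / snr_clean):.0f}%")
```

The printed percentages depend entirely on the placeholder inputs and should not be compared with the 35% quoted above. The point is only that the cross signal is untouched while the SNR degrades smoothly as the residual grows.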

3.2 Constraining parameters 



Figure 4. The 68.3%, 95.4% and 99.8% confidence contours for the parameters
$(A, \beta_T, \beta_{\mathcal{F}}, \Omega_\Lambda)$.

We shall now consider the possibility of constraining various model parameters using
the Fisher matrix analysis. Figure 4 shows the 68.3%, 95.4% and 99.8% confidence contours
obtained using the Fisher matrix analysis for the parameters $(A, \beta_T, \beta_{\mathcal{F}}, \Omega_\Lambda)$. Table 1
shows the $1\sigma$ errors for the above parameters. Owing to the smallness of the contribution
of the redshift space distortion to the power spectrum, the parameters $\beta_{\mathcal{F}}$ and $\beta_T$ are rather
badly constrained and their errors are very large. The other two parameters, $A$ and $\Omega_\Lambda$, are
constrained much better, at the (3.5%, 8%) level. The parameter $A$ is proportional to
$b_T \bar{x}_{\rm HI} C_{\mathcal{F}}$. Hence, the
constraint on $A$ is implicitly related to the constraint on the mean neutral fraction. The
projections presented here are for a single field-of-view radio observation.

The full coverage of a typical Ly-α survey spans a much larger volume, and several beams
of the radio observation shall fit within it. The noise scales as $\sigma/\sqrt{N}$, where $N$ is the number of
pointings. Such an enhancement of the volume shall allow the constraints to get much tighter.
Figure 5 shows the marginalized one-dimensional probability distribution function (PDF) for
$\beta_T$ and $\beta_{\mathcal{F}}$ corresponding to 10 pointings. This gives errors on the parameters which are
roughly of the same order of magnitude as the fiducial values themselves. The marginalized errors
can, however, be reduced by considering more pointings.
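
The distinction between marginalized and conditional errors can be made concrete with a two-parameter example. For a Fisher matrix $F$, the marginalized error on parameter $i$ is $\sqrt{(F^{-1})_{ii}}$ and the conditional error (all other parameters held fixed) is $1/\sqrt{F_{ii}}$. The matrix entries below are invented for illustration and are not the Fisher matrix of this work.

```python
import math


def fisher_errors_2x2(f11, f12, f22):
    """Marginalized and conditional 1-sigma errors for a 2x2 Fisher matrix."""
    det = f11 * f22 - f12 ** 2
    if det <= 0.0:
        raise ValueError("Fisher matrix must be positive definite")
    # diagonal of the inverse matrix: (F^-1)_11 = f22/det, (F^-1)_22 = f11/det
    marginalized = (math.sqrt(f22 / det), math.sqrt(f11 / det))
    conditional = (1.0 / math.sqrt(f11), 1.0 / math.sqrt(f22))
    return marginalized, conditional


# Strongly degenerate pair of parameters (illustrative numbers only).
marg, cond = fisher_errors_2x2(f11=400.0, f12=390.0, f22=400.0)
print("marginalized:", marg)
print("conditional :", cond)

# Uncorrelated parameters: the two kinds of error coincide.
marg0, cond0 = fisher_errors_2x2(f11=400.0, f12=0.0, f22=400.0)
print("uncorrelated:", marg0, cond0)
```

A strong degeneracy between two parameters inflates the marginalized errors well above the conditional ones, which is the qualitative behaviour seen for the redshift space distortion parameters here.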

Table 1. The $1\sigma$ errors on various parameters for a single-field observation.

Parameter              Fiducial value   1σ error (marginalized)   1σ error (conditional)
$\beta_T$              0.48             1.06                      0.04
$\beta_{\mathcal{F}}$  1.11             1.55                      0.05



[Figure 5 axis labels: $\beta_T$, $\beta_{\mathcal{F}}$]

Figure 5. The marginalized one-dimensional probability distribution function (PDF)
for $\beta_T$ and $\beta_{\mathcal{F}}$ corresponding to 10 pointings.

Till now we have assumed that all four parameters $(A, \beta_T, \beta_{\mathcal{F}}, \Omega_\Lambda)$ are unknown. In
the estimation of the parameters we have treated them as four free parameters. We find that
the constraints on $\beta_T$ and $\beta_{\mathcal{F}}$ are rather poor even if we consider 10 independent pointings of
radio observations. We now consider the error on each parameter assuming that the other three
are known. This gives us the conditional error on each parameter. The conditional $1\sigma$
errors on $\beta_T$ and $\beta_{\mathcal{F}}$ are 8.5% and 4.5% respectively for a single-pointing radio observation.
For 10 independent radio observations the conditional errors improve to 2.7%, 1.4%, 0.4%
and 0.6% for $\beta_T$, $\beta_{\mathcal{F}}$, $\Omega_\Lambda$ and $A$ respectively. We note that these conditional errors give the
best theoretical bounds on the parameters for the given observational specifications. The
constraints obtained on the redshift space distortion parameters $\beta$ from our cross-correlation
analysis are better than the existing constraints [25, 37] and competitive with other
cosmological probes aiming at the same measurements. Further, a higher density of
QSOs and enhanced SNR for the individual QSO spectra shall also ensure more stringent
constraints.
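
The 10-pointing values for the two redshift space distortion parameters can be cross-checked against the single-pointing ones using the $\sigma/\sqrt{N}$ scaling for independent fields. A short sketch (the function name is ours):

```python
import math


def scale_error(single_field_error, n_pointings):
    """Error after combining n_pointings independent, identical fields."""
    return single_field_error / math.sqrt(n_pointings)


# Conditional single-pointing errors quoted in the text, in percent.
single = {"beta_T": 8.5, "beta_F": 4.5}

for name, err in single.items():
    print(f"{name}: {err}% -> {scale_error(err, 10):.1f}% for 10 pointings")
```

This reproduces the 2.7% and 1.4% quoted above.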

Discussion & Conclusions



[Figure 6 axis label: Number of antennae]

Figure 6. The $1\sigma$ conditional errors on the parameters $\beta_T$ and $\beta_{\mathcal{F}}$ as a
function of QSO number density $\bar{n}$ (left) and number of antennae (right), for a total observing time
$t_{\rm obs} = 400$ hrs and $\bar{n} = 30$ deg$^{-2}$ respectively, for a single-field radio observation. We note that the
total collecting area of the telescope increases linearly with the number of antennae. All
other parameters are kept fixed.

Our analysis has so far focused on estimates obtained for specific HI 21-cm intensity
mapping (with SKA1-mid) and Ly-α forest (with BOSS) experiments. We shall now discuss
how these estimates vary for various other possible experiments.

We first consider the effect of the QSO number density on the constraints on the redshift
space distortion parameters. Figure 6 (left) shows the $1\sigma$ conditional errors on the redshift space
distortion parameters $\beta_T$ and $\beta_{\mathcal{F}}$ as a function of the QSO number density $\bar{n}$ for a total observing time
$t_{\rm obs} = 400$ hrs for a single-field radio observation. All other observational parameters are held
fixed. We find that there is a significant improvement in the constraints with increasing
QSO number density. This improvement is expected to saturate at very
high QSO number density. However, obtaining a QSO number density $\bar{n} > 50$ deg$^{-2}$ with high
SNR is unfeasible in the near future, and therefore we do not explore that possibility. The right
panel of Figure 6 shows the $1\sigma$ conditional errors on $\beta_T$ and $\beta_{\mathcal{F}}$ as a function of the number
of antennae for $\bar{n} = 30$ deg$^{-2}$. The total collecting area of the telescope increases linearly
with the number of antennae. We find a considerable
improvement in the constraints when the total number of antennae increases from 62 to 500,
beyond which we hardly notice any improvement. Based on this, we argue that an increase in
the number of antennae beyond 500 for SKA1-mid-like experiments is unlikely to improve the
constraints on the redshift space distortion parameters. We note that the SKA telescope is in its
design phase, and any degradation of the baseline distribution through a reduction of the proposed
number of antennae will affect the constraints on $\beta_T$ and $\beta_{\mathcal{F}}$ according to Figure 6.

An important feature of radio interferometer design is the array configuration, which
determines the baseline distribution $\rho_{2D}(\mathbf{U}, \nu)$. In the previous section we considered the
baseline distribution function for SKA1-mid. We now consider three other baseline
distributions, corresponding to radially symmetric antenna distributions of the form $\sim r^{-n}$
with $n = 1, 2, 3$. We assume that the array has a uniform antenna distribution up to a
radius of 80 m. We also assume that all the antennae are confined within a radius of 1 km.
Figure 7 shows the normalized baseline distribution function for these cases.

Table 2. The $1\sigma$ conditional errors on various parameters for different baseline distributions.

Parameter              SKA1-mid   n = 1   n = 2    n = 3
$\beta_T$              0.04       0.037   0.032    0.032
$\beta_{\mathcal{F}}$  0.05       0.049   0.044    0.044
$\Omega_\Lambda$       0.013      0.012   0.011    0.011
$A$                    0.002      0.002   0.0018   0.0018

Figure 7. The normalized baseline distribution function $\rho_{2D}(\mathbf{U}, \nu)$. The solid curve shows the
normalized baseline distribution for SKA1-mid. The other three curves correspond to power-law
antenna distributions with power-law indices n = 3, 2, 1 (top to bottom).

We wish to study the role of the compactness of the radio array in parameter estimation.
The $1\sigma$ conditional errors for these baseline distribution functions are summarized in Table
2. We find that there is a general improvement in our results compared to SKA1-mid
when all the antennae are confined within a radius of 1 km. We note that for SKA1-mid
about 50% of the antennae are outside the 1 km core, whereas all antennae are confined
within a radius of 1 km for the above three array designs. We find that there is a slight
improvement in the constraints when the power-law index is varied from n = 1 to 2, but
hardly any improvement is noticed when the index is changed from n = 2 to n = 3.
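
As an illustration of how the power-law index controls the compactness of the baselines, the sketch below draws antenna positions from a radially symmetric surface density that is flat out to 80 m and falls as $r^{-n}$ out to 1 km, and then collects the pairwise baseline lengths. Reading "$\sim r^{-n}$" as a surface density, the antenna count of 200, the random seed and the rejection-sampling approach are all assumptions made here; this is not the procedure used to produce Table 2 or Figure 7.

```python
import math
import random
import statistics

R_CORE = 80.0     # radius of the uniform core in metres (from the text)
R_MAX = 1000.0    # outer radius of the array in metres (from the text)


def radial_pdf(r, n):
    """Unnormalized p(r): surface density times r, for index n >= 1."""
    if r < R_CORE:
        return r
    return R_CORE ** n * r ** (1 - n)


def sample_antennae(n_ant, n, rng):
    """Draw antenna (x, y) positions by rejection sampling in radius."""
    positions = []
    while len(positions) < n_ant:
        r = rng.uniform(0.0, R_MAX)
        # p(r) never exceeds R_CORE when n >= 1, so R_CORE bounds the envelope
        if rng.uniform(0.0, R_CORE) < radial_pdf(r, n):
            phi = rng.uniform(0.0, 2.0 * math.pi)
            positions.append((r * math.cos(phi), r * math.sin(phi)))
    return positions


def baseline_lengths(positions):
    """Lengths of all antenna pairs."""
    return [math.dist(p, q)
            for i, p in enumerate(positions)
            for q in positions[i + 1:]]


rng = random.Random(1)
medians = {}
for n in (1, 2, 3):
    ants = sample_antennae(200, n, rng)
    medians[n] = statistics.median(baseline_lengths(ants))
    print(f"n = {n}: median baseline = {medians[n]:.0f} m")
```

Steeper indices concentrate the antennae near the core, so the median baseline shrinks as n increases. A histogram of `baseline_lengths` would give a rough, unnormalized analogue of the curves in Figure 7.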

It is clear that distributing antennae at large distances in such experiments (where
resolution is not a prime concern) is not advantageous, since at large baselines the signal is
sub-dominant. On the contrary, it is useful to consider arrays in which most antennae are packed
within a small radius. However, we note that an arbitrary compactification (from n = 2 to 3)
of the array causes us to lose k-modes which contribute to the signal, and there is no further
improvement in the SNR.

It is important to choose the optimal observational strategy for the 21-cm signal. One
may consider a deep observation (long time of observation) in a single field of view, as opposed
to dividing the same total observation time over many pointings. We know
that on large scales the dominant contribution to the noise comes from cosmic variance.
This can only be mitigated by increasing the survey volume. A deep observation of a small
patch of the sky is not beneficial for reducing the cosmic variance. Observing multiple
fields ensures that both the system noise contribution and the cosmic variance for the
cross-correlation are reduced. The noise in this case is reduced on all scales. However, if one
considers an observation for the same total time but in a single field of view, the system
noise contribution indeed reaches the same minimum value, but the noise on large scales, which is
dominated by cosmic variance, does not get reduced. It is thus strategically better
for cross-correlation measurements to consider observations in multiple fields.
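
This trade-off can be sketched with a toy error model. It assumes that the variance of the cross power in a bin is dominated by the product term, $\sigma^2 \approx P_{\mathcal{FF},{\rm tot}}\,(P_{TT} + N_T)/N_c$, that the 21-cm noise power $N_T$ scales inversely with the integration time per field, and that independent fields simply add their modes. The function names and all inputs are placeholders.

```python
import math


def cross_error(p_ff_total, p_tt, n_t_ref, n_modes, n_fields, time_factor):
    """1-sigma error on the cross power in one bin.

    n_t_ref     : 21-cm noise power for the reference integration time
    n_fields    : number of independent pointings combined
    time_factor : integration time per field, in units of the reference
    """
    n_t = n_t_ref / time_factor          # assumed N_T proportional to 1/t
    variance = p_ff_total * (p_tt + n_t) / (n_modes * n_fields)
    return math.sqrt(variance)


def compare(p_tt, n_t_ref, n_split=10):
    """Return (multi-field error, single deep-field error) for equal total time."""
    common = dict(p_ff_total=10.0, p_tt=p_tt, n_t_ref=n_t_ref, n_modes=100)
    multi = cross_error(n_fields=n_split, time_factor=1.0, **common)
    deep = cross_error(n_fields=1, time_factor=float(n_split), **common)
    return multi, deep


# Large scales: signal power far above the noise (cosmic-variance limited).
print("cosmic-variance limited:", compare(p_tt=1000.0, n_t_ref=1.0))
# Small scales: noise far above the signal (system-noise limited).
print("system-noise limited   :", compare(p_tt=1.0, n_t_ref=1000.0))
```

In the noise-limited case the two strategies give nearly the same error, because the cross-correlation variance is linear in $N_T$. In the cosmic-variance-limited case only the multi-field strategy gains the factor $\sqrt{N}$.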

The issue of foreground subtraction, though less severe for the cross-correlation, is still
a major concern for the 21-cm signal. Astrophysical foregrounds from galactic and extragalactic
sources plague the signal [29], and a significant amount of foreground subtraction is
required before the cross-correlation is performed. The foregrounds appear as noise in the
cross-correlation and may be tackled by considering a larger number of Fourier modes (larger
volumes). Similarly, for the Ly-α forest observations, continuum subtraction and the avoidance of
metal line contamination, though less problematic, have to be performed with extreme precision.
Moreover, man-made radio frequency interference (RFI), calibration errors and
other systematics pose a serious threat to the detection of the HI 21-cm signal. A detailed
analysis of these issues is outside the scope of the present work. We intend to address these
observational aspects in a future work.

Finally, this work emphasizes the important role of cross-correlation in bypassing
the fundamental problem that 21-cm foregrounds pose for detecting the signal through
auto-correlation. We have shown that the 3D cross-correlation power spectrum from the post-reionization
epoch, as a direct probe of cosmological structure formation, can be detected to a
high level of statistical sensitivity with telescopes such as the SKA. We have also focused on
the possibility of constraining the bias (redshift space distortion) parameters for the 21-cm
signal and the Ly-α forest. This investigation is crucial for understanding the nature of
HI bias in the two distinct astrophysical systems under consideration, namely the Ly-α forest (diffuse
low-density HI in the IGM) and the 21-cm signal (clumped HI in DLAs). These biases are
studied extensively in numerical simulations. It is important for observational constraints to
be compared with the simulation results to support their validity. Modeling of the IGM is
incomplete without precise knowledge of the HI bias. In the absence of high-quality 21-cm
data it is important to make predictions for future experiments. We find that strong
constraints may be obtained on $(\beta_T, \beta_{\mathcal{F}})$ from advanced BOSS- and SKA-like experiments.

We conclude by noting that the cross-correlation of the Ly-α forest and the HI 21-cm
signal, as an independent probe of astrophysics and cosmology, may allow us to put strong
constraints on redshift space distortion parameters, an important step towards understanding and
modeling the post-reionization HI distribution.